Arithmetic unit for multiplying long integers modulo M and R.S.A.
converter provided with such multiplication device.
A systolized and modular arithmetic device has a control module,
followed by a series arrangement of processing modules, followed by a
tail module. For multiplying an integer P and an integer Q modulo a
third integer M, a provisional product is incremented each time with Q
for a -1- bit in P, preceded by a doubling of the product. For a -0-
bit only the doubling ensues. Normalizing mod M is effected by adding
the complement of M, W, under control of propagated carry values. A
similar procedure is proposed for exponentiation of Q**E.
FIELD OF THE INVENTION

The invention relates to an arithmetic unit for multiplying long
integers modulo an integer M. Generally, arithmetic units have been in
use for standard word lengths, such as 8, 16 or 32 bits, and thus have
been optimized for the actual word length. While operating on a word,
the result is usually provided in parallel, which has necessitated
complex features such as carry ripples, the complexity of such
provisions and the response time incurred growing quickly with
increasing word length. The present invention supports arithmetic with
a clock rate that is independent of the word length.

The design is systolic in that it consists of a serial arrangement of a
single control module followed by an array of processing modules, all
of which are simultaneously active. Such a device is useful as a
converter for the well-known RSA encryption, see R.L. Rivest et al., A
Method for Obtaining Digital Signatures and Public-Key Cryptosystems,
Comm. ACM, Vol. 21 (February 1978), pages 120-126. The RSA encryption
is based on the recognition of a triad of integers E, D, M, such that
for any integer message X < M: (X**(E*D)) mod M = X. Herein, the
asterisk conventionally indicates a multiplication, the double asterisk
an exponentiation. Both E and D are smaller than M. In a typical field
of use, M consists of 512 bits. The algorithm is used in public
cryptographic systems especially because the encoding key (E, M) and
the decoding key (D, M) cannot be practically derived from each other.
Raising to a particular power can be done by repeated multiplication
with the starting integer X.
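
As a small numerical illustration of the identity (X**(E*D)) mod M = X,
the following Python sketch uses a toy triad. The values 61, 53, 17 and
2753 are illustrative assumptions that do not come from this document,
and Python's built-in pow merely stands in for the arithmetic unit.

```python
# Toy RSA triad (assumed values, far too small for real use).
M = 61 * 53          # modulus, 3233
E = 17               # encoding exponent
D = 2753             # decoding exponent, chosen so that (E * D) mod 3120 == 1


def convert(x, exponent, modulus):
    """Raise x to the given exponent modulo modulus (built-in modular pow)."""
    return pow(x, exponent, modulus)


message = 65
cipher = convert(message, E, M)      # encode with key (E, M)
restored = convert(cipher, D, M)     # decode with key (D, M)
print(message, cipher, restored)
```

Encoding followed by decoding returns the original message for every
X < M in this toy setting.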

The multiplication of non-integer messages is being considered an
obvious extension. If N is the number of bits of M, conventionally the
complexity of the above power raising is of the order of O(N**3), which
leads to a low conversion speed in case of execution on a sequential
general purpose machine even if the latter is intrinsically fast.

SUMMARY OF THE INVENTION 

Among other things, it is an object of the invention to provide a
systolic machine for multiplying two natural numbers P, Q modulo a
third natural number M with a machine size of O(N) sufficient to
accommodate P, Q, M, and a high throughput speed. Conventional
solutions have two problems. First, a multiplication phase followed by
a separate reduction phase would lead to an intermediate result of size
2*N in between the two phases. By now unconventionally handling the
bits of the control operand in the multiplication procedure in order of
decreasing significance, the operations in the multiplication procedure
and the operations in the reduction procedure can be interleaved so as
to keep the intermediate result restricted. The interleaving of both
procedures leads to a dynamic control sequence. Therefore an arithmetic
unit is needed with control signals to indicate which operation is to
be executed. The second problem is that in such an arithmetic unit with
large size, both the propagation of carries and the broadcasting of the
control signals would prevent a high clock rate. Therefore in the
systolic solution carries ripple in the direction of increasing
significance (carry save technique) while control signals move in the
opposite direction.

It is the above way of cross-wise propagation of the necessary
ingredients to the calculation (apart from those quantity parts that
are locally present) that allows the set-up to be described hereinafter
to attain a throughput speed of O(1/N).

According to one specific aspect of the invention the object is
realized by providing a systolized and modular arithmetic device for
multiplying a first multibit integer Q with a second multibit integer P
modulo a third multibit integer M, said arithmetic unit comprising a
control module followed by a series arrangement of processing modules,
followed by a tail module, said processing modules having modular
storage means for storing mutually exclusive first bit parts of said
first integer Q and mutually exclusive second bit parts indicating said
third integer M of pairwise equal significance and along said series
arrangement of monotonously decreasing significance levels away from
said control module, said control module having presentation means for,
as based on successive bit positions of said second integer P,
presenting a control bit string for, by each control bit so present, in
a first cycle part of each cycle rippling an elementary multiplication
operation in the low significant direction through said series
arrangement, in a second cycle part rippling a carry propagation in the
high significant direction, in a third cycle part selectively upon
detecting an egression over said third integer rippling a modularizing
operation in the low significant direction and in a fourth cycle part
rippling a borrow quantity in the high significant direction, four
successive cycle parts constituting a complete cycle associated with a
particular bit position of said second multibit integer P, and wherein
said tail module has emulating means for emulating dummy parts of said
first and third integers with respect to the low significant end of
said series arrangement.

Although the general set-up works excellently according to this general
principle, it has been considered better to translate the modularizing
into an addition with the complement of M. In this respect,
advantageously, the control module has presentation means for serially
presenting a control signal string for forward propagation through said
series arrangement, under control of successive bit positions of said
second integer according to monotonously decreasing significance
levels, to wit a -double- instruction followed by an addQ instruction
under control of a -1- bit in P, but only a -double- instruction under
control of a zero bit in P, and said series arrangement having
backpropagation means for backpropagating a carry signal after each
control signal, the control module forward propagates an addW
instruction for every carry received which results effectively in a
subtraction of M. Advantageously, the normalization value W is the
complement of said third integer M (W = 2**N - M, where N is the size
of M, i.e. 2**(N-1) - 1 < M < 2**N).
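
The effect of trading a carry for an addW instruction can be checked
numerically. The sketch below is an integer-level model under stated
assumptions: the function name exchange_carry and the sample values are
hypothetical. A carry has weight 2**N, so removing one carry and adding
W changes the represented value by W - 2**N = -M.

```python
def exchange_carry(r, car, W):
    """Trade one pending carry (weight 2**N) for an addition of W to r."""
    assert car > 0
    return r + W, car - 1


N = 8                      # assumed word size for the illustration
M = 201                    # assumed modulus with 2**(N-1) - 1 < M < 2**N
W = 2**N - M               # complement of M

r, car = 57, 1             # represented value: car * 2**N + r
before = car * 2**N + r
r, car = exchange_carry(r, car, W)
after = car * 2**N + r
print(before, after, before - after)
```

The difference between the two represented values equals M, so the
residue modulo M is unchanged by the exchange.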

Advantageously, each processing module operates on only a single bit
significance level. This provides an extremely simple layout of the
modules, whereas the granularity allows for using only the number of
modules needed. Alternatively, modules could operate on a succession of
bit significance levels, such as 2, 4 or 8. Within each module, the
operation may be conventional, so that the module operates faster than
would be possible with the corresponding single-bit modules. Between
successive modules, the inventive procedures and hardware provisions
would be present. This would still allow for enough flexibility in most
cases, while still keeping the design of the modules relatively simple.

The invention also relates to a device for exponentiation of an integer
X to a power E modulo said third integer M, wherein said series
arrangement has control means for loading said integer X as
representing said first integer Q and said control module has second
control means for activating said presentation means for presenting
various powers of said integer X as representing said second integer P,
and wherein furthermore recycling means are provided to recycle a
preliminary product back to said control module as representing a
subsequent value of said second integer P. Exponentiation is a sequence
of multiplications, wherein the two factors are built from the original
integer X. In an elementary set-up, only the product is recycled and
multiplied by X, so that the exponent is raised by one.

For faster operation, however, either the preliminary product is
recycled and multiplied with itself, followed by a multiplication by X,
or only the multiplication with itself is executed, as based on whether
the exponent bit in question is 1 or 0, respectively. The only
necessary provision is that the storage capacity per module is
sufficient, inasmuch as now always the one factor Q, the product
actually being formed, and the preliminary product of the preceding
multiplication must be stored. For example, for E = 11 (binary 1011),
the first bit produces X, the second X*X = X**2, the third
X**2 * X**2 * X = X**5, the fourth X**5 * X**5 * X = X**11. Note that
the first bit in fact does X**0 * X**0 * X = X.

Various advantageous aspects are recited in dependent Claims.

BRIEF DESCRIPTION OF THE FIGURES 

A preferred embodiment of the invention will be explained in particular
with respect to the accompanying Figures and Tables, wherein first the
general setup is explained with respect to an embodiment, next an
embodiment of the control module, thereafter an embodiment of the
processing module.

Figures 1, 2, 3a, 3b are various symbolic diagrams of
multiplication/exponentiation devices while also illustrating a
three-bit multiplication cum normalization modulo M.
Tables 1-12 specify various operations and definitions.

DESCRIPTION OF A PREFERRED EMBODIMENT 

Figure 1 is an elementary block diagram of an
exponentiation/multiplication device according to the invention.
Herein, the device is conceptually decomposed into a control module R
and an array of processing modules symbolized by a block SAN, wherein N
is the number of modules. The control module receives from the outer
world, such as a higher level host machine, signals Hin, Vin and
outputs thereto through Vout. Towards the array of processing modules
the control module outputs signal Fout and receives therefrom signals
Cin, Pin. Likewise, the array of processing modules receives signals
Fin and outputs signals Cout, Pout back to the control module. Between
the processing modules the interface is the same as the one between
control module and the string of processing modules.

The setup shown is detailed hereinafter for the exponentiation.
Multiplication is a subcomputation of exponentiation. Now, the data
available in the control module consists of E, the exponent value that
is smaller than M, and two counters n0, n1 in the range [0,N] (where N
is the number of bits of M). The array of processing modules must now
calculate (X ** E) mod M. To this effect the array of processing
modules stores the following quantities:
W = 2**N - M and X, the quantity that is to be raised to exponent E.
Moreover the processing modules have means to store the intermediate
results P, Q and r. More particularly, Q is an intermediate result of
exponentiation, whereas r is an intermediate result of the
multiplication of the operands P and Q (where P is either X or Q).
DESCRIPTION OF THE OPERATIONAL PROCESS

Table 1 specifies the types of various quantities figuring in the
method. The phase is a binary value. At the first value the key is
loaded. At the second value the message blocks are loaded and converted
and finally output. The type bit is standard binary. The type carry is
a three-valued carry quantity. The type instruction may have any of
eleven values, each named according to the list thereafter. In
particular, to be discussed hereinafter in detail, -store- means
transferring from the storage space reserved for quantity r to the
storage space reserved for the quantities W, Q, X and P (storeW,
storeQ, storeX, storeP). The instruction may be coded in a conventional
way such as a four bit quantity. Non-used values may be designed to
control special functions, such as test.
The instruction loadX transfers information from X to r. Next, the
channel types are specified. Hin is the externally controlled phase;
Vin, Vout, Pin are binary channels. Cin is a channel of type carry, and
Fout is for outputting control values of type instruction. The control
module receives carries through input port Cin. In the communication
between the control module and the array of processing modules the
communications through the instruction channel Fout and the carry
channel Cin alternate. This section only considers the operation of the
control module on a functional level. On a hardware level, mapping of
those functions on elements of a programmed microprocessor would be
conventional. Alternatively, a translation to dedicated special purpose
hardware would be obvious to the skilled art technician. Generally,
several modules could be realized on an integrated circuit.
Alternatively, the whole device could be computer-assisted-designed
into a single integrated circuit.

Now, for the control module in particular, Table 2 gives the variable
declaration and the principal execution loop by the control module.
There is declared a variable h of type -phase-, two carry values car
and c of type carry, two count values n0, n1 of type (0..N), a bit
value b and an array E of N bits containing the exponent.

The infinite loop waits for the phase control signal which it receives
through channel Hin in variable h. If the phase value "loadkey"
(actually a binary value) is received, the conversion key is loaded.
The key, in particular, consists of quantities E, W, to wit, the
exponent and the complemented modulo value M. First the exponent is
loaded in the bit array E, then the value W is loaded in the
distributed variable r (r is distributed over the processing modules).
Subsequently this value is stored in distributed variable W by
forwarding the instruction storeW to the array of processing modules.
The storeW instruction is followed by a carry communication from the
processing modules to the control module.

On the other hand, if the received phase indicates "convert", the
message block X is loaded in variable r and subsequently stored in X by
forwarding the instruction storeX and waiting for the carry value.
Subsequently, the procedure X_exp_E_mod_M is called, which yields a
value that is smaller than 2**N. Since the result must be smaller than
M, the normalize operation is executed which yields an equivalent value
(modulo M) smaller than M. Finally the result is output. The
exponentiation and normalization effectively are done by the series of
processing modules to be detailed hereinafter. Inasmuch as h may have
only two different values, always one of the phases prevails.

In the control module, via Vin, Vout and Pin always blocks of N bits
are communicated bit-serially, starting with the most significant bit.
Table 3 gives the realizations of the two operations load_exponent and
load_r. In load_exponent, count n0 is initialized to N. Thereafter, the
system loops in that for each bit position, the bit on signal Vin is
awaited and stored in an array E. In operation load_r, the variable r
is first initialized to zero by issuing the instruction set0 and
thereupon, the carry c is awaited on Cin. Next, the count n0 is set to
N. Thereafter the control module executes a loop in which for each bit
position, first, the bit on port Vin is received and the count
decremented. Next, the value of r is doubled by issuing the instruction
mul2 and awaiting carry c from Cin. If the bit received through Vin is
equal to zero the iteration step is ready. If the received bit is equal
to 1, the value of r is incremented by one by issuing the instruction
add1 and once more receiving a carry from Cin. After N such steps r has
the value received through port Vin.

The procedure used to compute (X**E) mod M is based on the algorithm
shown in Table 6. The exponentiation maintains the invariant
(Q**(2**n)) * X**(E mod 2**n) = X**E. First Q is set to 1 and n is set
to N. Now, the most significant zero bits in the exponent are skipped.
Thereafter, as long as n is positive, the following loop is executed.
First, n is decremented and Q is squared. Next, if the current bit in
the exponent equals 1, Q is multiplied by X; otherwise Q is not
modified.
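
The loop and its invariant can be written out as a short Python sketch.
This is an illustration only: it works on unreduced integers, the
function name exponentiate is hypothetical, and the extra n > 0 guard
in the skipping loop is an added assumption so that E = 0 does not run
past the exponent bits.

```python
def exponentiate(X, E, N):
    """Left-to-right binary exponentiation in the manner of Table 6.

    N is the number of exponent bits; no modular reduction is applied.
    """
    Q, n = 1, N
    # Skip the most significant zero bits (guarded against E == 0).
    while n > 0 and (E >> (n - 1)) & 1 == 0:
        n -= 1
    while n > 0:
        n -= 1
        Q = Q * Q                     # square for every remaining bit
        if (E >> n) & 1:
            Q = Q * X                 # multiply by X for a 1 bit
        # Invariant stated in the text, checked after every step.
        assert Q ** (2 ** n) * X ** (E % 2 ** n) == X ** E
    return Q


print(exponentiate(3, 11, 4) == 3 ** 11)
```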

Likewise, Table 7 gives the algorithm for multiplication, i.e. for
computing (P*Q). The computation maintains the invariant
(P mod 2**n) * Q + r * (2**n) = P * Q. In this case variable r is set
to zero, and variable n is set to N. Now, as long as n is positive,
first, n is decremented and r is doubled. Next, depending on which of
the two guards is true, either no operation is effected, or Q is added
to r.
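
A minimal Python sketch of this double-and-add multiplication, with the
invariant checked after every step, is given below. The function name
multiply is a hypothetical choice, and the sketch omits the reduction
modulo M.

```python
def multiply(P, Q, N):
    """Most-significant-bit-first multiplication in the manner of Table 7."""
    r, n = 0, N
    while n > 0:
        n -= 1
        r = 2 * r                     # doubling step
        if (P >> n) & 1:
            r = r + Q                 # add Q for a 1 bit of P
        # Invariant stated in the text.
        assert (P % 2 ** n) * Q + r * 2 ** n == P * Q
    return r


print(multiply(4, 5, 3))
```

Because the bits of P are consumed in order of decreasing significance,
the partial result r can be reduced between steps, which is what the
interleaved procedure of Table 5 exploits.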
Table 4 shows the procedure X_exp_E_mod_M as executed by the control
module. The distributed variable Q is set to 1 by issuing three
instructions. First, the instruction set0 is output and carry c
awaited; this makes r equal to zero. Next, instruction add1 is output
and carry c awaited. Thereafter, instruction storeQ is output and the
carry awaited. The combination of these three operations sets variable
Q to 1. Next counter n0 is initialized to N. Thereafter the most
significant zero bits in the exponent are skipped. The remaining bits
of the exponent are handled by executing a loop in which each step
starts in a state in which the variables r and Q contain the same
value. In each step first the procedure mul_Q_mod_M is executed, which
operation is specified hereinafter. This procedure, in particular,
assigns to both r and Q the value r * Q mod M. Therefore, since r was
equal to Q at the beginning of the procedure execution, the procedure
has squared Q modulo M. Next, for zero value of the exponent bit
pointed to by n0, the iteration step is completed. For a one value of
the exponent bit in question, first, the variable r is set to X by
issuing the instruction loadX and subsequently receiving a carry.
Subsequently the procedure mul_Q_mod_M is executed, which results in
the computation of Q*X modulo M.

Table 5 specifies the procedure mul_Q_mod_M. In the power raising, P is
an auxiliary variable for the multiplication. First, the value in r is
stored in variable P by issuing the instruction storeP and subsequently
receiving a carry. Next, the instruction set0 is output and the carry c
is awaited. Now, the quantities car and b are made zero, and count n1
is set equal to N. It should be noted that n0 sequences along the bits
of the exponent value, whereas n1 sequences along the bits of
multiplication factor P. Next, a loop is executed as long as either the
car value, or b or n1 is positive (car>0 or b=1 or n1>0). In each loop
step one out of three commands is executed. If the car value is
positive (car>0), the value of car is decreased by one and an addW
instruction is issued. Both operations together effectuate a
subtraction by M, since a carry represents the value 2**N and
M = 2**N - W. If the car value is zero and b equals one (car=0 and
b=1), b is set to zero and the addQ instruction is output. If both car
and b are zero (car=0 and b=0), n1 is positive (follows from loop
condition) and therefore the next bit of factor P can be received and
count n1 decremented by one. Next, the mul2 instruction is output.
After issuing one of the three instructions, the carry c is awaited and
added to the value of car. In this way all bits of factor P are treated
in succession. At the end of the loop r is equivalent to P*Q modulo M,
but it is not necessarily smaller than 2**N. Therefore, a new loop is
entered in which all possible carries in the processing modules are
propagated towards the control module. First count n1 is again set to
N. The loop continues as long as the value of either car or n1 is
positive. If car is positive, car is decremented, count n1 is set to N,
and instruction addW is output. If car is equal to zero (which implies
that n1 is positive), count n1 is decremented and the ident instruction
is output. The instruction ident does not modify the value of r, it
only serves to propagate carries. After issuing one of the two
commands, carry c is awaited and added to the value car.

After the loop the value of r is smaller than 2**N. The value of r is
then stored in Q by issuing the storeQ instruction and receiving a
carry.
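
The interleaving of multiplication and reduction can be sketched at
integer level in Python. This is a simplified model and not the
hardware procedure itself: carries are assumed to reach the control
module immediately after each instruction, so the second (ident) loop
of Table 5 is not needed here, and the function name mul_q_mod_m is a
hypothetical choice.

```python
def mul_q_mod_m(P, Q, W, N):
    """Integer-level model of the first loop of Table 5.

    Returns r with r < 2**N and r congruent to P*Q modulo M = 2**N - W.
    """
    mask = (1 << N) - 1
    r, car, b, n1 = 0, 0, 0, N
    while car > 0 or b == 1 or n1 > 0:
        if car > 0:                 # addW: trade one carry for W
            car -= 1
            r += W
        elif b == 1:                # addQ for a 1 bit of P
            b = 0
            r += Q
        else:                       # fetch next bit of P, then mul2
            n1 -= 1
            b = (P >> n1) & 1
            r *= 2
        car += r >> N               # carry returned to the control module
        r &= mask                   # r itself stays within N bits
    return r


# Values of the three-bit example: P = 4, Q = 5, W = 2, hence M = 6.
print(mul_q_mod_m(4, 5, 2, 3))
```

The intermediate result never needs more than N bits plus a small carry
count, in contrast to a separate multiplication phase that would
produce 2*N bits before reduction.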

After the exponentiation the value of r is normalized. Table 8 details
the normalizing operation, which assigns to variable r the value
r mod M. Initially, r is smaller than 2**N and after normalization r is
even smaller than M. Initially the value of r is also present in Q.
First the value of W is added to r by issuing the addW instruction and
subsequently propagating all carries. If a carry is received, the
original r was not smaller than M and the updated value is r-M. If, on
the other hand, no carry is received, the old value of r, still being
present in Q, was correct and should be output. So, the operation
starts with sending the instruction addW and awaiting a carry. Count n1
is set to N. Now, as long as c=0 and n1 is non-zero, the ident
instruction is output, the counter n1 is decreased, and a carry value
received. Upon leaving the loop one of the following conditions holds.
If no carry is received (c=0), the value of Q is restored in r by first
issuing the instruction set0 and receiving a carry value and
subsequently issuing the instruction addQ and receiving a carry value.
If, on the other hand, the carry is equal to one, the updated value of
r is correct and is left unmodified. Finally, the result is output as
indicated in Table 9. First, the value of r is stored in the shift
register by issuing the storeP instruction and awaiting a carry value.
Next, count n0 is set equal to N. In the next following loop, as long
as n0 > 0, a bit is received from Pin and subsequently output through
Vout, after which n0 is decremented.
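
The normalizing decision can be sketched as follows in Python, again at
integer level and with a hypothetical function name. It relies on the
stated bounds r < 2**N and 2**(N-1) - 1 < M < 2**N, so that at most one
subtraction of M is required.

```python
def normalize(r, W, N):
    """Return r mod M for r < 2**N, where M = 2**N - W."""
    s = r + W                        # effect of the addW instruction
    carry = s >> N                   # carry out of the N-bit array
    if carry:
        return s & ((1 << N) - 1)    # carry dropped: result is r - M
    return r                         # no carry: old value (kept in Q) stands


print(normalize(7, 2, 3))
```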

Figure 2 more specifically shows the systolic design of SAN. The design
is recursive in that module SAn+1 consists of an arithmetic cell A, a
shift register cell S, and a similar but smaller module SAn. If n is
positive, module SAn again consists of a pair of A/S cells and a module
SAn-1. Finally there is a tail cell SA0, which does not effectively
execute any arithmetic.

Table 10 exemplifies the function of tail cell SA0. It continuously
loops while receiving instruction f, and depending on the nature of
this instruction, outputs either a 1 or a 0 on its carry output Cout.
SA0 never communicates through port Pout.
Table 11, likewise, shows the functions of shift register cell S. It
continuously loops and in each loop step it first waits for a bit
either through port in from the corresponding arithmetic cell (load
operation) or through port Pin from its right-hand neighbour (shift
operation). The bit received is subsequently output to its left
neighbour through port Pout. In fact, the cell is a one bit shift
register with a multiplexed input. Note that at any time at most one of
the two input ports offers a bit.

Likewise, Table 12 shows the functions of arithmetic cell A. It has a
variable c of type carry, a variable car with a value range 0..4, a
variable f of the type instruction, and three bits w, q, x. Now, first
the cell awaits an instruction from input Fin; at reception, variables
car, c are reset to zero. Next, a loop is executed where the
instruction value determines which operation is executed. For the
eleven instructions listed earlier the following operations are
performed: set0 sets car to zero, add1 retains the value of car, mul2
multiplies car by two, ident retains the value of car, addW adds w to
car, addQ adds q to car, loadX assigns to car the value of x, storeW
assigns to w the least significant bit of car (car mod 2), storeQ
assigns to q the value (car mod 2), storeX assigns to x the value
(car mod 2), storeP outputs the value (car mod 2) through output port
out. Next, the instruction f is output through output port Fout, and
the two most significant bits of car (car div 2) are output through
Cout. Then, the next instruction is received via Fin and a carry value
via Cin. The carry value received is added to the least significant bit
of car and then the next loop step is executed.
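
The cell behaviour described above can be modelled functionally in
Python. The sketch below is an assumption-laden illustration: the class
CellArray, the helper load_r and the lock-step update are hypothetical
simplifications that ignore the half-cycle pipelining of the real
array, and storeP with the shift register cells is not modelled. Each
call to step applies one instruction in every cell, passes car div 2 to
the more significant neighbour, and returns the carry that reaches the
control module.

```python
class CellArray:
    """Functional model of N arithmetic cells A followed by the tail cell.

    Index 0 is the least significant cell; car[i] is that cell's local
    digit, which may temporarily exceed 1 (carry-save form).
    """

    def __init__(self, n_bits):
        self.n = n_bits
        self.car = [0] * n_bits
        self.w = [0] * n_bits
        self.q = [0] * n_bits
        self.x = [0] * n_bits

    def step(self, f):
        """Apply instruction f in every cell; return the carry to the control module."""
        car = self.car
        stores = {"storeW": self.w, "storeQ": self.q, "storeX": self.x}
        for i in range(self.n):
            if f == "set0":
                car[i] = 0
            elif f == "mul2":
                car[i] *= 2
            elif f == "addW":
                car[i] += self.w[i]
            elif f == "addQ":
                car[i] += self.q[i]
            elif f == "loadX":
                car[i] = self.x[i]
            elif f in stores:
                stores[f][i] = car[i] % 2
            # add1 and ident leave car unchanged inside the cell
        outs = [c // 2 for c in car]            # Cout of each cell
        tail = 1 if f == "add1" else 0          # carry emitted by the tail cell
        for i in range(self.n):
            car[i] = car[i] % 2 + (outs[i - 1] if i > 0 else tail)
        return outs[-1]

    def value(self):
        """Integer represented by the (possibly redundant) digits."""
        return sum(c << i for i, c in enumerate(self.car))


def load_r(cells, number):
    """Load number into r, most significant bit first, as in Table 3."""
    cells.step("set0")
    for k in reversed(range(cells.n)):
        cells.step("mul2")
        if (number >> k) & 1:
            cells.step("add1")


cells = CellArray(3)
load_r(cells, 5)
cells.step("storeQ")
carry = cells.step("mul2")
print(cells.value(), carry)
```

In this model the represented value plus 2**N times the returned carry
always equals the value the instruction should produce, which is the
property the control module relies on when it answers each carry with
an addW instruction.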

The systolic design is extreme in that each processing module contains
one bit of the values P, Q and W. Such a solution is fast, but makes
great demands on area and power consumption, since such a bit-slice
architecture requires the maximum number of cells to construct the
multiplier. If such a high speed is not required, a slower but cheaper
design can be obtained by increasing the slice size.

For completeness' sake, Figures 3a, 3b give a complete example of
multiplying the three-bit integers P = 4 (100) by Q = 5 (101) modulo M,
represented by W = 2 (010). Control module RS and processing modules
H2, H1, H0 have been shown, together with the quantities they contain.
In Figure 3b, the first column shows 37 states, which each correspond
either to the operation in a cycle part (indicated by arrows), together
with the quantities transferred, or to the situation after such
transfer and the operation ensued. In consequence, a sequence of 8
lines corresponds to one cycle. The quantity n is the number of bits
awaiting processing; true and false have been abbreviated as t, f,
respectively; d is the binary produced by the tail module emulating a
dummy quantity; id (ident), aw (addW), aq (addQ), db (double) are self
explanatory. On line 37 the control module finds that d=true,
indicating the end. Given the bit length of the two multiplicands, this
could come no later than line 37, so it may as well be determined by
counting. Dependent on the various quantities W, P, Q, the end could
also come earlier, so the embodiment shown is somewhat faster, under
particular circumstances.

Claims 

1. A systolized and modular arithmetic device for multiplying a first
   multibit integer Q with a second multibit integer P modulo a third
   multibit integer M, said arithmetic unit comprising a control module
   followed by a series arrangement of processing modules, followed by
   a tail module, said processing modules having modular storage means
   for storing mutually exclusive first bit parts of said first integer
   Q and mutually exclusive second bit parts indicating said third
   integer M of pairwise equal significance and along said series
   arrangement of monotonously decreasing significance levels away from
   said control module, said control module having presentation means
   for presenting a control bit of a control bit string in a first
   cycle part and receiving means for receiving a carry value in a
   second cycle part, and wherein said processing modules have means to
   receive simultaneously a control bit from said control bit string
   from their respective more significant neighbour and a carry value
   from their respective less significant neighbour in a first cycle
   phase, as well as presentation means for presenting both a carry
   value to their respective more significant neighbour and the control
   bit of said control bit string to their respective less significant
   neighbour in a second phase, and wherein each processing module
   operates half a cycle out of phase with respect to its neighbours,
   and wherein said tail module has emulating means for emulating dummy
   parts of said first and third integers with respect to the low
   significant end of said series arrangement.

2. A device as claimed in Claim 1, wherein said control module has
   presentation means for serially presenting a control signal string
   for forward propagation through said series arrangement, under
   control of carries received as well as successive bit values of said
   second integer according to monotonously decreasing significance
   levels, to wit a double instruction followed by an addQ instruction
   under control of a -1- bit in P, but only a double instruction under
   control of a zero bit in P, and said series arrangement having
   backpropagation means for back propagating a carry signal after each
   control signal, the control module forward propagates an addW
   instruction for every carry received which results effectively in a
   subtraction of M.

3. A device as claimed in Claim 1 or 2, wherein each processing module
   operates on a single bit significance level.

4. A device as claimed in Claim 1, 2 or 3, wherein said control module
   has detection means for after presentation of all bit signals of
   said second integer detecting arrival of an unchanged dummy signal
   from said tail module signalling that no carry or non-dummy control
   signal is being propagated in said series arrangement for so
   signalling availability of a modularized product.

5. A device as claimed in Claim 1, 2 or 3, wherein said control module
   has count means for after presentation of all bit signals of said
   second integer counting said cycles for upon reaching a particular
   count as determined by the length of said series arrangement
   signalling that any carry or non-dummy control signal had been
   rippled out for thereby signalling availability of a modularized
   product.

6. A device as claimed in Claim 1 for exponentiation of an integer X to
   a power E modulo said third integer M, wherein said series
   arrangement has control means for loading said integer X as
   representing said first integer Q and said control module has second
   control means for activating said presentation means for presenting
   various powers of said integer X as representing said second integer
   P, and wherein furthermore recycling means are provided to recycle a
   preliminary product back to said control module as representing a
   subsequent value of said second integer P.

7. A device as claimed in Claim 6, said control module having selection
   means for under control of successive bits in said exponent, either
   squaring said preliminary product, or squaring said preliminary
   product, followed by multiplying the new preliminary product by said
   integer X.

[Figure 3b: state sequence, lines 1 to 37, of the three-bit example
multiplication; for the control module and each processing module it
lists the instruction transferred (id, db, aq, aw), the carry value and
the dummy flag (t/f) in each cycle part.]

type phase       = {loadkey, convert}
     bit         = (0..1)
     carry       = (0..2)
     instruction = {set0, add1, mul2, ident, addW,
                    addQ, loadX, storeW, storeQ, storeX,
                    storeP}

Hin             : phase
Vin, Vout, Pin  : bit
Cin             : carry
Fout            : instruction

Table 1



Control module R
var h      : phase;
    car, c : carry;
    n0, n1 : (0..N);
    b      : bit;
    E      : array [0..N-1] of bit

do true -> Hin? h;
   if h = loadkey -> load_exponent;
                     load_r;
                     Fout! storeW; Cin? c
   □  h = convert -> load_r;
                     Fout! storeX; Cin? c;
                     X_exp_E_mod_M;
                     Normalize;
                     output
   fi
od

Table 2
load_exponent: n0 := N;
   do n0 > 0 -> Vin? b;
                n0 := n0-1; E[n0] := b
   od

load_r: Fout! set0; Cin? c;
   n0 := N;
   do n0 > 0 -> Vin? b; n0 := n0-1;
                Fout! mul2; Cin? c;
                if b=0 -> skip
                □  b=1 -> Fout! add1; Cin? c
                fi
   od

Table 3



X_exp_E_mod_M:
   Fout! set0; Cin? c;
   Fout! add1; Cin? c;
   Fout! storeQ; Cin? c;
   n0 := N;
   do E[n0-1]=0 -> n0 := n0-1 od;
   do n0 > 0 -> n0 := n0-1;
                Fout! set0; Cin? c;      {these two instructions
                Fout! addQ; Cin? c;       make r = Q}
                mul_Q_mod_M;
                if E[n0]=0 -> skip
                □  E[n0]=1 -> Fout! loadX; Cin? c;
                              mul_Q_mod_M
                fi
   od

Table 4
mul_Q_mod_M:
   Fout! storeP; Cin? c;
   Fout! set0; Cin? c;
   car := 0; b := 0; n1 := N;
   do car > 0 or b=1 or n1 > 0 ->
      if car > 0       -> car := car-1; Fout! addW
      □  car=0 and b=1 -> b := 0; Fout! addQ
      □  car=0 and b=0 -> Pin? b; n1 := n1-1;
                          Fout! mul2
      fi;
      Cin? c; car := car+c
   od;
   n1 := N;
   do car > 0 or n1 > 0 ->
      if car > 0 -> car := car-1; n1 := N; Fout! addW
      □  car=0   -> n1 := n1-1; Fout! ident
      fi;
      Cin? c; car := car+c
   od;
   Fout! storeQ; Cin? c

Table 5



Exponentiation:
   Q := 1; n := N;
   do E[n-1]=0 -> n := n-1 od;
   do n > 0 -> n := n-1;
               Q := Q*Q;
               if E[n]=0 -> skip
               □  E[n]=1 -> Q := Q*X
               fi
   od

Table 6
Multiplication:
   r := 0; n := N;
   do n > 0 -> n := n-1;
               r := 2*r;
               if P[n]=0 -> skip
               □  P[n]=1 -> r := r+Q
               fi
   od

Table 7



Normalize:
   Fout! addW; Cin? c;
   n1 := N;
   do c=0 and n1 > 0 -> n1 := n1-1;
                        Fout! ident; Cin? c
   od;
   if c=0 -> Fout! set0; Cin? c;
             Fout! addQ; Cin? c
   □  c=1 -> skip
   fi

Table 8



Output:
   Fout! storeP; Cin? c;
   n0 := N;
   do n0 > 0 -> Pin? b; Vout! b; n0 := n0-1 od

Table 9
Tail cell SA0:
do true -> Fin? f;
   if f = add1 -> Cout! 1
   □  f ≠ add1 -> Cout! 0
   fi
od

Table 10



Shift register cell S:
do true -> if in? b
           □  Pin? b
           fi;
           Pout! b
od

Table 11



Arithmetic cell A
var c: carry; car: (0..4); f: instruction;
    w, q, x: bit

Fin? f; car := 0; c := 0;
do true -> if f=set0   -> car := 0
           □  f=add1   -> car := car
           □  f=mul2   -> car := 2*car
           □  f=ident  -> car := car
           □  f=addW   -> car := car+w
           □  f=addQ   -> car := car+q
           □  f=loadX  -> car := x
           □  f=storeW -> w := (car mod 2)
           □  f=storeQ -> q := (car mod 2)
           □  f=storeX -> x := (car mod 2)
           □  f=storeP -> out! (car mod 2)
           fi;
           Fout! f; Cout! (car div 2);
           Fin? f; Cin? c; car := (car mod 2)+c
od

Table 12